What grammars tell us about corpora: the case of reduced relative clauses
Authors: Paola
Abstract
We present a large (65 million words of Wall Street Journal) and in-depth corpus study of a particular syntactic ambiguity to investigate (1) to what extent the structure of a grammar is reflected in a corpus, and (2) how probability functions defined according to a grammar fit independently established measures of syntactic disambiguation preference. We look at the well-known case of the ambiguity between a main clause and reduced relative construction. We measure the probability distributions of several linguistic features (transitivity, tense, voice) over a sample of optionally intransitive verbs. In agreement with recent results on parsing with lexicalised probabilistic grammars (Collins, 1997; Srinivas, 1997), we find that statistics over lexical, as opposed to structural, features best correspond to human intuitive judgments and to experimental findings. These results are enlightening to investigate novel uses of corpora, by assessing the portability of statistics across tasks, and by determining what is needed for useful syntactic annotation of corpora.